Active flash: towards energy-efficient, in-situ data analytics on extreme-scale machines

Authors

  • Devesh Tiwari
  • Simona Boboila
  • Sudharshan S. Vazhkudai
  • Youngjae Kim
  • Xiaosong Ma
  • Peter Desnoyers
  • Yan Solihin
Abstract

Modern scientific discovery is increasingly driven by large-scale supercomputing simulations, followed by data analysis tasks. These data analyses are either performed offline, on smaller-scale clusters, or in-situ, on the supercomputer itself. Both of these strategies are rife with storage and I/O bottlenecks, energy inefficiencies due to increased data movement, and increased time to solution. We propose Active Flash, a novel approach to in-situ, scientific data analysis, wherein data analysis is conducted on the solid-state device (SSD), where the data already resides. Our performance and energy models show that Active Flash has the potential to address many of the aforementioned concerns. In addition, we demonstrate an Active Flash prototype built on a commercial SSD controller, which further reaffirms the viability of our proposal.

Similar resources

Reducing Data Movement Costs Using Energy-Efficient, Active Computation on SSD

Modern scientific discovery often involves running complex application simulations on supercomputers, followed by a sequence of data analysis tasks on smaller clusters. This offline approach suffers from significant data movement costs such as redundant I/O, storage bandwidth bottlenecks, and wasted CPU cycles, all of which contribute to increased energy consumption and delayed end-to-end perform...

FlexIO: Location-flexible Execution of In Situ Data Analytics for Large Scale Scientific Applications

Increasingly severe I/O bottlenecks on High-End Computing machines are prompting scientists to process simulation output data while simulations are running and before placing data on disk: "in situ" and/or "in-transit". There are several options in placing in-situ data analytics along the I/O path: on compute nodes, on staging nodes dedicated to analytics, or after data is stored on persistent...

Analytical Cost Metrics: Days of Future Past

As we move towards the exascale era, the new architectures must be capable of running the massive computational problems efficiently. Scientists and researchers are continuously investing in tuning the performance of extreme-scale computational problems. These problems arise in almost all areas of computing, ranging from big data analytics, artificial intelligence, search, machine learning, vir...

In situ data analysis and I/O acceleration of FLASH astrophysics simulation on leadership-class system using GLEAN

The performance mismatch between computing and I/O components of current-generation HPC systems has made I/O a critical bottleneck for scientific applications. It is therefore critical to make data movement as efficient as possible and to facilitate simulation-time data analysis and visualization in order to reduce the data written to storage. These issues will be of paramount importance to ena...

A New Method for Ranking Extreme Efficient DMUs Based on Changing the Reference Set with Using L2 - Norm

The purpose of this study is to utilize a new method for ranking extreme efficient decision making units (DMUs) based upon the omission of these efficient DMUs from the reference set of inefficient and non-extreme efficient DMUs in data envelopment analysis (DEA) models with constant and variable returns to scale. In this method, an L2-norm is used and it is believed that it doesn't have any e...

Publication date: 2013